-------------Apple Galaxian------------
A 4am crack                  2020-06-23
-------------------. updated 2020-06-24
                   |___________________

Name: Apple Galaxian
Genre: action
Year: 1981
Credits: Tony Suzuki
Publisher: StarCraft
Platform: Apple ][ (48K)
Media: 5.25-inch disk
Sides: 1
OS: custom
Previous cracks: none (*)

(*) All known cracks of this game are
    based on the Broderbund re-release.
    This original version by StarCraft
    is unpreserved.

                   ~

               Chapter 0
 In Which Various Automated Tools Fail
          In Interesting Ways


COPYA
  immediate disk read error

Locksmith Fast Disk Backup
  unable to read any track

EDD 4 bit copy (no sync, no count)
  errors on tracks $0E-$22
  copy works

Copy ][+ nibble editor
  Tracks $0E-$22 are unformatted
  Lower tracks look like 5-3 encoded
    (13-sector disk) with a different
    address prologue on each track

Passport
  Recognizes and traces the boot but
  fails to read the disk with its own
  RWTS (this is mysterious)

Why didn't COPYA work?
  not a 16-sector disk

Why didn't Locksmith FDB work?
  not a 16-sector disk

EDD worked. What does that tell us?
  no half or quarter tracks
  probably no runtime protection check
  just the altered sector structure
    with a custom loader

Once the game is loaded, it never uses
the disk again. There is no high score
saving and no reload when a game ends.
I think this will be one of those
"capture the game in memory and rebuild
it from the ground up" cracks.

Next steps:

  1. Trace bootloader
  2. Capture game code in memory
  3. Write game to a standard disk and
     build a bootloader to load it
  4. Declare victory (*)

(*) but do not go to the gym until
    there's a vaccine

                   ~

               Chapter 1
      In Which We Brag About Our
           Humble Beginnings


I have two floppy drives, one in slot 6
and the other in slot 5. My "work disk"
(in slot 5) runs Diversi-DOS 64K, which
is compatible with Apple DOS 3.3 but
relocates most of DOS to the language
card on boot. This frees up most of
main memory (only using a single page
at $BF00..$BFFF), which is useful for
loading large files or examining code
that lives in areas typically reserved
for DOS.

[S6,D1=original disk]
[S5,D1=my work disk]

The floppy drive firmware code at $C600
is responsible for aligning the drive
head and reading sector 0 of track 0
into main memory at $0800. Because the
drive can be connected to any slot, the
firmware code can't assume it's loaded
at $C600. If the floppy drive card were
removed from slot 6 and reinstalled in
slot 5, the firmware code would load at
$C500 instead.

To accommodate this, the firmware does
some fancy stack manipulation to detect
where it is in memory (which is a neat
trick, since the 6502 program counter
is not generally accessible). However,
due to space constraints, the detection
code only cares about the lower 4 bits
of the high byte of its own address.

Stay with me, this is all about to come
together and go boom.

$C600 (or $C500, or anywhere in $Cx00)
is read-only memory. I can't change it,
which means I can't stop it from
transferring control to the boot sector
of the disk once it's in memory. BUT!
The disk firmware code works unmodified
at any address. Any address that ends
with $x600 will boot slot 6, including
$B600, $A600, $9600, &c.

; copy drive firmware to $9600
*9600<C600.C6FFM

; and execute it
*9600G
...reboots slot 6, loads game...
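
To make the "any $x600 address" claim
concrete, here is a minimal sketch in
Python (not part of my actual toolchain).
It assumes, as described above, that the
firmware keeps only the low 4 bits of
the high byte of its own address. The
function name is made up for
illustration.

```python
def boot_slot(firmware_address):
    """Slot that relocated firmware would boot, per the rule above."""
    high_byte = (firmware_address >> 8) & 0xFF
    return high_byte & 0x0F  # only the low 4 bits are examined


for address in (0xC600, 0xB600, 0xA600, 0x9600, 0xC500):
    print(f"${address:04X} -> slot {boot_slot(address)}")
```

Every $x600 address maps to slot 6, and
$C500 maps to slot 5.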

Now then:

]PR#5
...
]CALL -151

*9600<C600.C6FFM

*96F8L

96F8-   4C 01 08    JMP   $0801

That's where the disk controller ROM
code ends and the on-disk code begins.
But $9600 is part of read/write memory.
I can change it at will. So I can
interrupt the boot process after the
drive firmware loads the boot sector
from the disk but before it transfers
control to the disk's bootloader.

; instead of jumping to on-disk code,
; copy boot sector to higher memory so
; it survives a reboot
96F8-   A0 00       LDY   #$00
96FA-   B9 00 08    LDA   $0800,Y
96FD-   99 00 28    STA   $2800,Y
9700-   C8          INY
9701-   D0 F7       BNE   $96FA

; turn off slot 6 drive motor
9703-   AD E8 C0    LDA   $C0E8

; reboot to my work disk in slot 5
9706-   4C 00 C5    JMP   $C500

*BSAVE TRACE0,A$9600,L$109
*9600G
...reboots slot 6...
...reboots slot 5...

]BSAVE OBJ.0800-08FF,A$2800,L$100

Now we get to(*) trace the boot process
one sector, one page, one instruction
at a time.

(*) If you replace the words "need to"
    with the words "get to," life
    becomes amazing.

                   ~

               Chapter 2
       Transitions Are Difficult


*801L

0801-   A2 00       LDX   #$00
0803-   BD 00 08    LDA   $0800,X
0806-   9D 00 02    STA   $0200,X
0809-   E8          INX
080A-   D0 F7       BNE   $0803
080C-   4C 0F 02    JMP   $020F
080F-   A0 AB       LDY   #$AB
0811-   98          TYA
0812-   85 3C       STA   $3C
0814-   4A          LSR
0815-   05 3C       ORA   $3C
0817-   C9 FF       CMP   #$FF
0819-   D0 09       BNE   $0824
081B-   C0 D5       CPY   #$D5
081D-   F0 05       BEQ   $0824
081F-   8A          TXA
0820-   99 00 08    STA   $0800,Y
0823-   E8          INX
0824-   C8          INY
0825-   D0 EA       BNE   $0811
...

This is part of a 13-sector / 16-sector
hybrid bootloader that was used on many
different disks circa 1981. The problem
was that the first floppy drives and
drive controllers could only read 13-
sector disks (DOS 3.2). Then Apple
upgraded the drives and controllers to
16-sector (DOS 3.3), which were not
backward compatible.

This left publishers with a conundrum,
since they didn't want to ship two
disks (expensive!) or have customers
specify which kind of drive they had
(error-prone).

Solution: a single disk where track 0
contained BOTH 13-sector and 16-sector
boot sectors. (They used different
address prologues, allowing two sector
0s to coexist on one track.) The older
drives would find the 13-sector code,
and the newer drives would find the
16-sector code. Then the 16-sector code
would load the 13-sector code and jump
to it, and the disk could pretend it
was still a purely 13-sector world.

Transitions are difficult.

Naturally, there were two different
solutions to this problem which did the
same thing but were incompatible with
each other. I think of this one as the
"second" solution based on the order in
which I supported them in Passport, but
I don't actually know which one was
developed first.

Anyway, this code acts like the old 13-
sector drive controller, in that it
loads the 13-sector boot sector and
jumps to $0301.

...
0841-   4C 01 03    JMP   $0301

So that is where I get to interrupt the
boot.

                   ~

               Chapter 3
  In Which The Tools Do Not Save Us,
     And Actually Present Us With
        Further Mysteries Which
      Our Obsessive Completionism
         Requires Us To Solve


*9600<C600.C6FFM

; set up callback when the original
; disk would have jumped to $0301
96F8-   A9 05       LDA   #$05
96FA-   8D 42 08    STA   $0842
96FD-   A9 97       LDA   #$97
96FF-   8D 43 08    STA   $0843

; start the boot
9702-   4C 01 08    JMP   $0801

; callback is here -- copy code on page
; three to higher memory so it survives
; a reboot
9705-   A0 00       LDY   #$00
9707-   B9 00 03    LDA   $0300,Y
970A-   99 00 23    STA   $2300,Y
970D-   C8          INY
970E-   D0 F7       BNE   $9707

; turn off the drive motor
9710-   AD E8 C0    LDA   $C0E8

; reboot to my work disk in slot 5
9713-   4C 00 C5    JMP   $C500

*BSAVE TRACE1,A$9600,L$116

*9600G
...reboots slot 6...
...reboots slot 5...

]BSAVE OBJ.0300-03FF,A$2300,L$100
]CALL -151

*2301L

2301-   B9 00 08    LDA   $0800,Y
2304-   0A          ASL
2305-   0A          ASL
2306-   0A          ASL
2307-   99 00 08    STA   $0800,Y
230A-   C8          INY
230B-   D0 F4       BNE   $2301
230D-   A6 2B       LDX   $2B
230F-   A9 09       LDA   #$09
2311-   85 27       STA   $27
2313-   AD CC 03    LDA   $03CC
2316-   85 41       STA   $41
2318-   84 40       STY   $40

Again, I can recognize this as standard
code used on multiple disks. This is
the second stage bootloader which loads
RWTS into higher memory (usually $3600
or $B600 -- the high byte is loaded
from $03CC) and jumps to it.

*23CC

23CC- B6

$B600 it is.

The sector read loop ends when the
sector number in $3D matches the value
in $03FF (so the EOR leaves 0), then it
branches to $033A to set up the jump
to the next stage.

232D-   A5 3D       LDA   $3D
232F-   4D FF 03    EOR   $03FF
2332-   F0 06       BEQ   $233A
2334-   E6 41       INC   $41
2336-   E6 3D       INC   $3D
2338-   D0 ED       BNE   $2327
233A-   85 3E       STA   $3E
233C-   AD CC 03    LDA   $03CC
233F-   85 3F       STA   $3F
2341-   E6 3F       INC   $3F
2343-   6C 3E 00    JMP   ($003E)

You'd think I could patch $0343, but
this is actually used by the sector
read loop to point to $C65C, the entry
point inside the drive controller
firmware to read an arbitrary sector.
But anywhere in $033A..$0342 is safe,
as that's only used after the loop
exits.

*9600<C600.C6FFM

; set up callback #1 (same as previous)
96F8-   A9 05       LDA   #$05
96FA-   8D 42 08    STA   $0842
96FD-   A9 97       LDA   #$97
96FF-   8D 43 08    STA   $0843

; start the boot
9702-   4C 01 08    JMP   $0801

; callback #1 is here --
; set up callback #2 after RWTS is in
; memory
9705-   A9 4C       LDA   #$4C
9707-   8D 3C 03    STA   $033C
970A-   A9 17       LDA   #$17
970C-   8D 3D 03    STA   $033D
970F-   A9 97       LDA   #$97
9711-   8D 3E 03    STA   $033E

; continue the boot
9714-   4C 01 03    JMP   $0301

; callback #2 is here --
; copy RWTS to lower memory so it
; survives a reboot to my work disk
9717-   A2 0A       LDX   #$0A
9719-   A0 00       LDY   #$00
971B-   B9 00 B6    LDA   $B600,Y
971E-   99 00 26    STA   $2600,Y
9721-   C8          INY
9722-   D0 F7       BNE   $971B
9724-   EE 1D 97    INC   $971D
9727-   EE 20 97    INC   $9720
972A-   CA          DEX
972B-   D0 EE       BNE   $971B

; turn off drive motor and reboot to my
; work disk
972D-   AD E8 C0    LDA   $C0E8
9730-   4C 00 C5    JMP   $C500

*BSAVE TRACE2,A$9600,L$133

*9600G
...reboots slot 6...
...reboots slot 5...

]BSAVE OBJ.B600-BFFF,A$2600,L$A00
]CALL -151

Execution continues at $B700 (in memory
at $2700).

*2700L

2700-   20 5C B7    JSR   $B75C

*275CL

; turn on hi-res-graphics page
; (uninitialized)
275C-   AD 52 C0    LDA   $C052
275F-   8E E9 B7    STX   $B7E9
2762-   AD 57 C0    LDA   $C057
2765-   AD 50 C0    LDA   $C050
2768-   60          RTS

Continuing from $B703...

2703-   8E F7 B7    STX   $B7F7
2706-   A9 01       LDA   #$01
2708-   8D F8 B7    STA   $B7F8
270B-   8D EA B7    STA   $B7EA
270E-   AD E0 B7    LDA   $B7E0
2711-   8D E1 B7    STA   $B7E1
2714-   A9 01       LDA   #$01
2716-   8D EC B7    STA   $B7EC
2719-   AD E2 B7    LDA   $B7E2
271C-   8D ED B7    STA   $B7ED
271F-   AD E3 B7    LDA   $B7E3
2722-   8D F1 B7    STA   $B7F1
2725-   A9 01       LDA   #$01
2727-   8D F4 B7    STA   $B7F4

This is all so close to a normal RWTS
that I'm wondering where the protection
is. The sector count is in $B7E0,
copied to $B7E1. The start track is 1,
set at $B716. The start sector is in
$B7E2, copied to $B7ED. The start
address is in $B7E3, copied to $B7F1.

*27E0

27E0- 9C

*27E2

27E2- 00

*27E3

27E3- 04

So we're reading $9C sectors, starting
at T01,S00, into $0400+. (I'm assuming
the address goes up. If it went down,
we'd run out of writeable pages pretty
quickly.) That sounds like the whole
game.
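
The arithmetic is easy to check. This
is a small Python sketch, assuming (as
I do above) that the address goes up by
one page per sector. The constant names
are mine.

```python
SECTOR_COUNT = 0x9C  # value found at $B7E0
START_PAGE = 0x04    # value found at $B7E3
PAGE_SIZE = 0x100    # one sector fills one page

first = START_PAGE * PAGE_SIZE
last = first + SECTOR_COUNT * PAGE_SIZE - 1
print(f"${first:04X}..${last:04X}")
```

That prints $0400..$9FFF, which leaves
$A000 and up untouched.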

Continuing from $B72A...

; set up regular RWTS globals
272A-   8A          TXA
272B-   4A          LSR
272C-   4A          LSR
272D-   4A          LSR
272E-   4A          LSR
272F-   AA          TAX
2730-   A9 00       LDA   #$00
2732-   9D F8 04    STA   $04F8,X
2735-   9D 78 04    STA   $0478,X

; execute the multi-sector, multi-track
; read we've laid out
2738-   20 93 B7    JSR   $B793

; reset the stack
273B-   A2 FF       LDX   #$FF
273D-   9A          TXS
273E-   8E EB B7    STX   $B7EB

; PR#0 & IN#0
2741-   20 93 FE    JSR   $FE93
2744-   20 89 FE    JSR   $FE89

; wipe RWTS from memory
2747-   A0 00       LDY   #$00
2749-   99 00 BF    STA   $BF00,Y
274C-   C8          INY
274D-   D0 FA       BNE   $2749
274F-   CE 4B B7    DEC   $B74B
2752-   AD 4B B7    LDA   $B74B
2755-   C9 B7       CMP   #$B7
2757-   D0 F0       BNE   $2749

; jump to game entry point
2759-   4C 00 06    JMP   $0600

That seems almost... prosaic? I mean,
I'm not complaining that the boot code
is straightforward to trace. (Fine, I'm
complaining a little bit.) But why
couldn't Passport crack this disk? It
did all the tracing we just did by hand
in the blink of an eye, but then it
couldn't read the disk. But this disk
can read itself.

What's the difference?

Drilling into the RWTS further, the
multi-sector read loop at $B793 looks
absolutely standard. It increments
sectors and addresses and wraps around
to sector 0 on new tracks, as usual.
(On 13-sector disks, it was more
efficient to read sectors in ascending
order. On 16-sector disks, it was the
reverse!)

But at $B7B5, the high-level entry
point for reading a single sector, the
difference reveals itself:

*27B5L

27B5-   08          PHP
27B6-   78          SEI
27B7-   20 00 B8    JSR   $B800
27BA-   B0 03       BCS   $27BF
27BC-   28          PLP
27BD-   18          CLC
27BE-   60          RTS
27BF-   28          PLP
27C0-   38          SEC
27C1-   60          RTS

Do you see it? It's calling $B800. A
standard 13-sector RWTS would call
$BD00.

*2800L

; save state
2800-   48          PHA
2801-   08          PHP

; get current track number
2802-   AD EC B7    LDA   $B7EC

; track 0 skips the special treatment
2805-   F0 0A       BEQ   $2811

; on any other track, munge the track
; in some weird way
2807-   49 1E       EOR   #$1E
2809-   09 AA       ORA   #$AA

; and store it inside the RWTS
280B-   8D 80 B9    STA   $B980
280E-   EA          NOP
280F-   EA          NOP
2810-   EA          NOP

; restore state (track 0 also continues
; here)
2811-   28          PLP
2812-   68          PLA

; call actual RWTS entry point
2813-   4C 00 BD    JMP   $BD00

$B980 is here, in the middle of the
code to find the address prologue:

2970-   BD 8C C0    LDA   $C08C,X
2973-   10 FB       BPL   $2970
2975-   C9 D5       CMP   #$D5
2977-   D0 F0       BNE   $2969
2979-   EA          NOP
297A-   BD 8C C0    LDA   $C08C,X
297D-   10 FB       BPL   $297A
297F-   C9 AA       CMP   #$AA    <-- !
2981-   D0 F2       BNE   $2975
2983-   A0 03       LDY   #$03
2985-   BD 8C C0    LDA   $C08C,X
2988-   10 FB       BPL   $2985
298A-   C9 B5       CMP   #$B5
298C-   D0 E7       BNE   $2975

That's it; that's the whole protection.
It's what we saw in the Copy ][+ nibble
editor: a standard 13-sector disk with
a per-track rotating address prologue.
It also explains why Passport couldn't
read the disk. The RWTS requires a
non-standard entry point ($B800), but
Passport always uses the standard entry
point ($BD00). Without the per-track
change to the address prologue matching
code, the disk is unreadable.
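
Here is the munging at $B807 as a
Python sketch, to show which address
prologue the RWTS expects on each
track. The function name is invented.
Track 0 skips the patch entirely, so
the sketch assumes the byte is still
the unpatched $AA at that point.

```python
def second_prologue_byte(track):
    """Byte the patched CMP at $B97F expects after $D5."""
    if track == 0:
        return 0xAA  # assumption: patch never ran, byte is unmodified
    return (track ^ 0x1E) | 0xAA  # EOR #$1E / ORA #$AA


for track in range(0x0E):
    value = second_prologue_byte(track)
    print(f"T{track:02X}: D5 {value:02X} B5")
```

Note that the ORA means several tracks
end up sharing the same value.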

                   ~

               Chapter 4
   I Want It All, And I Want It Now


Once more into the breach. But this
time, I'll put the trace program at
$A600 instead of $9600 so I can let the
game load exactly where it wants to
load (up to $9FFF) before capturing it.
Remember, any $x600 works for tracing
the boot. My work disk moves DOS to the
language card, so I'm not overwriting
anything in main memory.

*A600<C600.C6FFM

That does feel weird to type though.
Muscle memory and all that.

; set up callback #1 and start the boot
; (same as previous)
A6F8-   A9 05       LDA   #$05
A6FA-   8D 42 08    STA   $0842
A6FD-   A9 A7       LDA   #$A7
A6FF-   8D 43 08    STA   $0843
A702-   4C 01 08    JMP   $0801

; callback #1 -- set up callback #2
; (same as previous)
A705-   A9 4C       LDA   #$4C
A707-   8D 3C 03    STA   $033C
A70A-   A9 17       LDA   #$17
A70C-   8D 3D 03    STA   $033D
A70F-   A9 A7       LDA   #$A7
A711-   8D 3E 03    STA   $033E
A714-   4C 01 03    JMP   $0301

; callback #2 -- put an RTS after the
; multi-sector read
A717-   A9 60       LDA   #$60
A719-   8D 3B B7    STA   $B73B

; call the original loader (will return
; gracefully because of the RTS -- man,
; the first time I saw qkumba use this
; technique, I audibly squealed)
A71C-   20 00 B7    JSR   $B700

; copy lower memory (text page and page
; 8) to higher memory so it survives a
; reboot
A71F-   A2 05       LDX   #$05
A721-   A0 00       LDY   #$00
A723-   B9 00 04    LDA   $0400,Y
A726-   99 00 A0    STA   $A000,Y
A729-   C8          INY
A72A-   D0 F7       BNE   $A723
A72C-   EE 25 A7    INC   $A725
A72F-   EE 28 A7    INC   $A728
A732-   CA          DEX
A733-   D0 EE       BNE   $A723

; turn off slot 6 drive motor and
; reboot to my work disk
A735-   AD E8 C0    LDA   $C0E8
A738-   4C 00 C5    JMP   $C500

*BSAVE TRACE3,A$A600,L$13B

(I absolutely messed that up the first
time and saved address $9600 instead.
Stupid muscle memory.)

*A600G
...reboots slot 6...
...read read read...
...reboots slot 5...

]BSAVE OBJ.0400-08FF,A$A000,L$500
]BSAVE OBJ.0900-9FFF,A$0900,L$9700

Now we have the entire game code in two
files.

                   ~

               Chapter 5
        Offset For Your Sanity


Now, how should we boot this game from
a standard disk? The entire game fits
in main memory, with room to spare.
With MAXFILES 1, there would be room
for a standard DOS 3.3 to load it all
from a file, copy a few pages over the
text page, and jump to the entry point.
The whole boot process wouldn't take
more than 30 seconds (maybe 15 with
Pronto-DOS).

Or we could write a fastloader and
load the entire game in three seconds.

So obviously, we're going to do that.

First, I wrote a little loop to write
all the game code onto tracks $01-$0A
in a standard format. Track $01 will
contain the data that ends up at $0400,
but it will be in memory at $1400
(offset by $1000 for my sanity). It's
slow to write sectors onto a 16-sector
disk in increasing order, but fear not,
this is not how the bootloader will
read them. (More on that in a minute.)

*B000L

; sector count
B000-   A9 9C       LDA   #$9C
B002-   85 FF       STA   $FF
B004-   A9 00       LDA   #$00
B006-   85 FE       STA   $FE

; write 1 sector
B008-   A9 B0       LDA   #$B0
B00A-   A0 88       LDY   #$88
B00C-   20 D9 03    JSR   $03D9

; increment sector and possibly track
B00F-   E6 FE       INC   $FE
B011-   A4 FE       LDY   $FE
B013-   C0 10       CPY   #$10
B015-   D0 07       BNE   $B01E
B017-   A0 00       LDY   #$00
B019-   84 FE       STY   $FE
B01B-   EE 8C B0    INC   $B08C
B01E-   98          TYA
B01F-   8D 8D B0    STA   $B08D

; increment address
B022-   EE 91 B0    INC   $B091

; decrement sector count
B025-   C6 FF       DEC   $FF

; loop until done
B027-   D0 DF       BNE   $B008
B029-   60          RTS

*B088.B097

B088- 01 60 01 00 01 00 FB F7
B090- 00 14 00 00 02 00 00 60

*BSAVE WRITER,A$B000,L$C0

*BLOAD OBJ.0400-08FF,A$1400
*BLOAD OBJ.0900-9FFF,A$1900

[insert a blank disk into slot 6]

*B000G
...write write write...

Now we have the entire game code on
consecutive tracks on a standard disk.
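
For the record, this is where each page
ends up. It's a Python sketch of the
bookkeeping the writer loop does, not a
replacement for it. The function and
parameter names are mine; the numbers
come from the listing above.

```python
SECTORS_PER_TRACK = 0x10


def layout(sector_count=0x9C, first_track=0x01,
           buffer=0x1400, offset=0x1000):
    """Return (track, sector, buffer address, final address) per page."""
    rows = []
    for i in range(sector_count):
        track = first_track + i // SECTORS_PER_TRACK
        sector = i % SECTORS_PER_TRACK
        source = buffer + i * 0x100      # where the page sits while writing
        rows.append((track, sector, source, source - offset))
    return rows


rows = layout()
for track, sector, source, final in (rows[0], rows[-1]):
    print(f"T{track:02X},S{sector:02X}: ${source:04X} -> ${final:04X}")
```

The last page lands on track $0A, which
is why the bootloader reads $0A tracks.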

                   ~

               Chapter 6
                 0boot

Once upon a time, I wrote a tiny
bootloader called 4boot. It was fast
and small and I was more than a little
bit proud of it. The boot1 code was a
mere 742 bytes and fit in $BD00..$BFFF.

Then qkumba did that thing he does, and
now it fits in zero page.

With his blessing, I present: 0boot v3.

0boot lives on track $00, just like me.
Sector $00 (boot0) reuses the disk
controller ROM routine to read sector
$0E (boot1). Boot0 creates a few data
tables, copies boot1 to zero page,
modifies it to accommodate booting from
any slot, and jumps to it.

Boot0 is loaded at $0800 by the disk
controller ROM routine.

; tell the ROM to load only this sector
; (we'll do the rest manually)
0800-  [01]

; The accumulator is $01 after loading
; sector $00, or $03 after loading
; sector $0E. We don't need to preserve
; the value, so we just shift the bits
; to determine whether this is the
; first or second time we've been here.
0801-   4A          LSR

; second run -- we've loaded boot1, so
; skip to boot1 initialization routine
0802-   D0 0D       BNE   $0811

; first run -- increment the physical
; sector to read (this will be the next
; sector under the drive head, so we'll
; waste as little time as possible
; waiting for the disk to spin)
0804-   E6 3D       INC   $3D

; X holds the boot slot (x16) --
; munge it into $Cx format (e.g. $C6
; for slot 6, but we need to
; accommodate booting from any slot)
0806-   8A          TXA
0807-   20 7B F8    JSR   $F87B
080A-   09 C0       ORA   #$C0

; push address (-1) of the sector read
; routine in the disk controller ROM
080C-   48          PHA
080D-   A9 5B       LDA   #$5B
080F-   48          PHA

; "return" via disk controller ROM,
; which reads boot1 into $0900 and
; exits via $0801
0810-   60          RTS

; Execution continues here (from $0802)
; after boot1 code has been loaded into
; $0900. On real Apple hardware, the Y
; register is always 0 at $0801, but it
; turns out the CFFA 3000 firmware does
; not always match this behavior --
; which is exactly the sort of bug that
; qkumba enjoys(*) uncovering -- so we
; initialize Y here (to 1, which is the
; value of the accumulator after the
; drive firmware loaded physical sector
; $03 and we performed an LSR).
0811-   A8          TAY

(*) not guaranteed, actual enjoyment
    may vary

; munge the boot slot, e.g. $60 -> $EC
; (to be used later)
0812-   8A          TXA
0813-   09 8C       ORA   #$8C

; Copy the boot1 code from $0901..$09FF
; to zero page. ($0900 holds the 0boot
; version number. This is version 3.
; $0000 is initialized later in boot1.)
0815-   BE 00 09    LDX   $0900,Y
0818-   96 00       STX   $00,Y
081A-   C8          INY
081B-   D0 F8       BNE   $0815

; There are a number of places in boot1
; that need to hit a slot-specific soft
; switch (read a nibble from disk, turn
; off the drive, &c). Rather than the
; usual form of "LDA $C08C,X", we will
; use "LDA $C0EC" and modify the $EC
; byte in advance, based on the boot
; slot. $00E1 is an array of all the
; places in the boot1 code that need
; this adjustment. (Y is 0 on entry,
; so the first read is from $E0+1.)
081D-   C8          INY
081E-   B6 E0       LDX   $E0,Y
0820-   95 00       STA   $00,X
0822-   D0 F9       BNE   $081D

; munge $EC -> $E0 (used later to
; advance the drive head to the next
; track)
0824-   29 F0       AND   #$F0
0826-   85 CB       STA   $CB

; munge $E0 -> $E8 (used later to
; turn off the drive motor)
0828-   09 08       ORA   #$08
082A-   85 D3       STA   $D3

; push sector interleave array to the
; bottom of the stack (by setting the
; stack pointer to #$0F and pushing
; #$10 bytes, those bytes will end up
; in $0100..$010F)
082C-   A2 0F       LDX   #$0F
082E-   9A          TXS
082F-   BD 90 08    LDA   $0890,X
0832-   48          PHA
0833-   CA          DEX
0834-   10 F9       BPL   $082F

For reference, this is the sector
interleave array:

0890- 00 07 0E 06 0D 05 0C 04
0898- 0B 03 0A 02 09 01 08 0F
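
If the stack pointer trick seems too
good to be true, here is a Python
sketch that mimics the loop at $082C.
It models only the 6502 behavior that
matters here: PHA stores at $0100+SP
and then decrements SP, wrapping at 0.

```python
INTERLEAVE = [0x00, 0x07, 0x0E, 0x06, 0x0D, 0x05, 0x0C, 0x04,
              0x0B, 0x03, 0x0A, 0x02, 0x09, 0x01, 0x08, 0x0F]

memory = bytearray(0x200)
sp = 0x0F                               # LDX #$0F / TXS
for x in range(0x0F, -1, -1):           # X counts down from $0F to $00
    memory[0x100 + sp] = INTERLEAVE[x]  # PHA stores at $0100+SP
    sp = (sp - 1) & 0xFF                # then SP decrements (and wraps)

print(memory[0x100:0x110].hex(" "))
print(f"SP = ${sp:02X}")
```

The array lands at $0100..$010F in its
original order, and the stack pointer
wraps around to $FF, ready for the
pushes that follow.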

; push the game entry point ($600 - 1)
0836-   A9 05       LDA   #$05
0838-   48          PHA
0839-   A9 FF       LDA   #$FF
083B-   48          PHA

; push several addresses to the
; stack (more on this later)
083C-   A2 06       LDX   #$06
083E-   B5 DA       LDA   $DA,X
0840-   48          PHA
0841-   CA          DEX
0842-   D0 FA       BNE   $083E

; number of tracks to load (x2) --
; this game uses $0A tracks
0844-   A0 14       LDY   #$14

; loop starts here
0846-   8A          TXA

; the carry was set by the "LSR" at
; $0801, so we won't take this branch
; the first time (but, as we will see
; shortly, the carry gets flipped off
; and on, and we end up taking this
; branch every second time through the
; loop)
0847-   90 03       BCC   $084C

; X was 0 going into this loop, and it
; never changes, so A will be 0 too.
; So this will push $0000 to the stack
; (to "return" to $0001, which reads a
; track into memory)
0849-   48          PHA
084A-   48          PHA

; There's a "SEC" hidden here (because
; it's opcode $38), but it's only
; executed if we take the branch at
; $0847, which lands at $084C, which is
; in the middle of this instruction.
; Otherwise we execute the compare,
; which clears the carry bit because A
; is always #$00 at this point. So the
; carry flip-flops between set and
; clear, so the BCC at $0847 is only
; taken every other time. Please clap.
084B-   C9 38       CMP   #$38

; Push $00B6 to the stack, to "return"
; to $00B7. This routine advances the
; drive head to the next half track.
084D-   48          PHA
084E-   A9 B6       LDA   #$B6
0850-   48          PHA

; loop until done
0851-   88          DEY
0852-   D0 F2       BNE   $0846

Because of the carry flip-flop, we will
push $00B6 to the stack every time
through the loop, but we will only push
$0000 every other time. The loop runs
for twice the number of tracks we want
to read, so the stack ends up looking
like this:

 --top--
  $00B6 (move drive 1/2 track)
  $00B6 (move drive another 1/2 track)
  $0000 (read track into memory)
  $00B6 \
  $00B6  } second group
  $0000 /
  $00B6 \
  $00B6  } third group
  $0000 /
  .
  . [repeated for each track]
  .
  $00B6 \
  $00B6  } final group
  $0000 /
  $FE88 (IN#0, pushed at $0840)
  $FE92 (PR#0, pushed at $0840)
  $00D1 (turn off drive motor)
  $05FF (game entry point - 1)
--bottom--

Boot1 reads the game into memory from
tracks $01-$0A, but it isn't a loop.
It's one routine that reads a track and
another routine that advances the drive
head. We're essentially unrolling the
read loop on the stack, in advance, so
that each routine gets called as many
times as we need, when we need it. Like
dancers in a chorus line, each routine
executes then cedes the spotlight. Each
seems unaware of the others, but in
reality they've all been meticulously
choreographed.
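
The carry flip-flop is easier to trust
once you've watched it run. This Python
sketch simulates only the loop at
$0846 (not the four addresses pushed
before it) and returns the addresses in
the order that RTS will pop them. The
function name is mine.

```python
def build_schedule(tracks=0x0A):
    """Return addresses pushed by the loop, top of stack first."""
    pushed = []
    carry = True                    # set by the LSR at $0801
    for _ in range(tracks * 2):     # LDY #$14 for $0A tracks
        if carry:                   # BCC not taken
            pushed.append(0x0000)   # PHA / PHA with A = 0
            carry = False           # CMP #$38 with A = 0 clears carry
        else:                       # BCC taken, lands on the hidden SEC
            carry = True
        pushed.append(0x00B6)       # PHA / LDA #$B6 / PHA
    return pushed[::-1]             # most recent push is popped first


for address in build_schedule()[:6]:
    print(f"${address:04X}")
```

Each group of three is two half-track
steps followed by one track read.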

                   ~

               Chapter 7
                 6 + 2


Before I can explain the next chunk of
code, I need to pause and explain a
little bit of theory. As you probably
know if you're the sort of person who
reads this sort of thing, Apple II
floppy disks do not contain the actual
data that ends up being loaded into
memory. Due to hardware limitations of
the original Disk II drive, data on
disk must be stored in an intermediate
format called "nibbles." Bytes in
memory are encoded into nibbles before
writing to disk, and nibbles that you
read from the disk must be decoded back
into bytes. The round trip is lossless
but requires some bit wrangling.

Decoding nibbles-on-disk into bytes-in-
memory is a multi-step process. In
"6-and-2 encoding" (used by DOS 3.3,
ProDOS, and all ".dsk" image files),
there are 64 possible values that you
may find in the data field (in the
range $96..$FF, but not all of those,
because some of them have bit patterns
that trip up the drive firmware). We'll
call these "raw nibbles."

Step 1: read $156 raw nibbles from the
data field. These values will range
from $96 to $FF, but as mentioned
earlier, not all values in that range
will appear on disk.

Now we have $156 raw nibbles.

Step 2: decode each of the raw nibbles
into a 6-bit byte between 0 and 63
(%00000000 and %00111111 in binary).
$96 is the lowest valid raw nibble, so
it gets decoded to 0. $97 is the next
valid raw nibble, so it's decoded to 1.
$98 and $99 are invalid, so we skip
them, and $9A gets decoded to 2. And so
on, up to $FF (the highest valid raw
nibble), which gets decoded to 63.

Now we have $156 6-bit bytes.
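
Here is step 2 as a Python sketch. The
validity rule is my understanding of
the 6-and-2 constraint, not something
spelled out above, so treat it as an
assumption: high bit set, at least one
pair of adjacent 1 bits below bit 7,
and at most one pair of adjacent 0
bits. The names are mine.

```python
def is_valid_nibble(value):
    """Assumed 6-and-2 rule for a raw disk nibble."""
    low7 = value & 0x7F
    has_adjacent_ones = (low7 & (low7 << 1)) != 0
    # bit i is set where bits i and i-1 of the nibble are both 0
    zero_pairs = ~(low7 | (low7 << 1)) & 0x7E
    one_zero_pair_at_most = bin(zero_pairs).count("1") <= 1
    return value >= 0x80 and has_adjacent_ones and one_zero_pair_at_most


DECODE = {}
for raw in range(0x80, 0x100):
    if is_valid_nibble(raw):
        DECODE[raw] = len(DECODE)   # valid nibbles are numbered in order

for raw in (0x96, 0x97, 0x9A, 0xFF):
    print(f"${raw:02X} -> {DECODE[raw]}")
```

Invalid nibbles like $98 and $99 simply
have no entry in the table.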

Step 3: split up each of the first $56
6-bit bytes into pairs of bits. In
other words, each 6-bit byte becomes
three 2-bit bytes. These 2-bit bytes
are merged with the next $100 6-bit
bytes to create $100 8-bit bytes. Hence
the name, "6-and-2" encoding.

The exact process of how the bits are
split and merged is... complicated. The
first $56 6-bit bytes get split up into
2-bit bytes, but those two bits get
swapped (so %01 becomes %10 and vice-
versa). The other $100 6-bit bytes each
get multiplied by 4 (a.k.a. bit-shifted
two places left). This leaves a hole in
the lower two bits, which is filled by
one of the 2-bit bytes from the first
group.

A diagram might help. "a" through "x"
each represent one bit.

             -------------

1 decoded      3 decoded
nibble in  +   nibbles in   =  3 bytes
first $56      other $100


00abcdef       00ghijkl
               00mnopqr
   |           00stuvwx
   |
 split            |
   &           shifted
swapped        left x2
   |              |
   V              V

000000fe   +   ghijkl00   =   ghijklfe
000000dc   +   mnopqr00   =   mnopqrdc
000000ba   +   stuvwx00   =   stuvwxba

             -------------

Tada! Four 6-bit bytes

  00abcdef
  00ghijkl
  00mnopqr
  00stuvwx

become three 8-bit bytes

  ghijklfe
  mnopqrdc
  stuvwxba
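
The diagram translates into a few lines
of Python. This is only a sketch of the
bit shuffling for one group of four
6-bit bytes. The function names are
mine, and it says nothing about where
the three output bytes land within the
sector.

```python
def swap2(pair):
    """Swap the two bits of a 2-bit value (%01 <-> %10)."""
    return ((pair & 1) << 1) | (pair >> 1)


def merge(aux, main0, main1, main2):
    """Combine 00abcdef with three 6-bit bytes, as in the diagram."""
    return [
        (main0 << 2) | swap2(aux & 0b11),          # ghijkl + fe
        (main1 << 2) | swap2((aux >> 2) & 0b11),   # mnopqr + dc
        (main2 << 2) | swap2((aux >> 4) & 0b11),   # stuvwx + ba
    ]


print([f"{b:08b}" for b in merge(0b100100, 0b000001, 0b000010, 0b000011)])
```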

When DOS 3.3 reads a sector, it reads
the first $56 raw nibbles, decodes them
into 6-bit bytes, and stashes them in a
temporary buffer (at $BC00). Then it
reads the other $100 raw nibbles,
decodes them into 6-bit bytes, and puts
them in another temporary buffer (at
$BB00). Only then does DOS 3.3 start
combining the bits from each group to
create the full 8-bit bytes that will
end up in the target page in memory.
This is why DOS 3.3 "misses" sectors
when it's reading, because it's busy
twiddling bits while the disk is still
spinning.

                   ~

               Chapter 8
             Shift Happens


0boot also uses "6-and-2" encoding. The
first $56 nibbles in the data field are
still split into pairs of bits that
need to be merged with nibbles that
won't come until later. But instead of
waiting for all $156 raw nibbles to be
read from disk, it "interleaves" the
nibble reads with the bit twiddling
required to merge the first $56 6-bit
bytes and the $100 that follow. By the
time 0boot gets to the data field
checksum, it has already stored all
$100 8-bit bytes in their final resting
place in memory. This means that 0boot
can read all 16 sectors on a track in
one revolution of the disk. That's
crazy fast.

To make it possible to do all the bit
twiddling we need to do and not miss
nibbles as the disk spins(*), we do
some of the work earlier. We multiply
each of the 64 possible decoded values
by 4 and store those values. (Since
this is accomplished by bit shifting
and we're doing it before we start
reading the disk, this is called the
"pre-shift" table.) We also store all
possible 2-bit values in a repeating
pattern that will make it easy to look
them up later. Then, as we're reading
from disk (and timing is tight), we can
simulate all the bit math we need to do
with a series of table lookups. There
is just enough time to convert each raw
nibble into its final 8-bit byte before
reading the next nibble.

(*) The disk spins independently of the
    CPU, and we only have a limited
    time to read a nibble and do what
    we're going to do with it before
    OMG HERE COMES ANOTHER ONE. So time
    is of the essence. Also, "As The
    Disk Spins" would make a great name
    for a retrocomputing-themed soap
    opera.

The first table, at $0200..$02FF, is
three columns wide and 64 rows deep.
Astute readers will notice that 3 x 64
is not 256. Only three of the columns
are used; the fourth (unused) column
exists because multiplying by 3 is hard
but multiplying by 4 is easy (in base 2
anyway). The three columns correspond
to the three pairs of bits in each of
those first $56 6-bit bytes. Since the
values are only 2 bits wide, each
column holds one of four different
values (%00, %01, %10, or %11).

The second table, at $036C..$03D5, is
the "pre-shift" table. This contains
all the possible 6-bit bytes, in order,
each multiplied by 4 (a.k.a. shifted to
the left two places, so the 6 bits that
started in columns 0-5 are now in
columns 2-7, and columns 0 and 1 are
zeroes). Like this:

       00ghijkl   -->   ghijkl00

Astute readers will notice that there
are only 64 possible 6-bit bytes, but
this second table is larger than 64
bytes. To make lookups easier, the
table has empty slots for each of the
invalid raw nibbles. In other words, we
don't do any math to decode raw nibbles
into 6-bit bytes; we just look them up
in this table (offset by $96, since
that's the lowest valid raw nibble) and
get the required bit shifting for free.


addr | raw |  decoded 6-bit | pre-shift
-----+-----+----------------+----------
$36C | $96 |  0 = %00000000 | %00000000
$36D | $97 |  1 = %00000001 | %00000100
$36E | $98        [invalid raw nibble]
$36F | $99        [invalid raw nibble]
$370 | $9A |  2 = %00000010 | %00001000
$371 | $9B |  3 = %00000011 | %00001100
$372 | $9C        [invalid raw nibble]
$373 | $9D |  4 = %00000100 | %00010000
  .
  .
  .
$3D4 | $FE | 62 = %00111110 | %11111000
$3D5 | $FF | 63 = %00111111 | %11111100
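
If you want to check the table by hand,
here is a minimal Python sketch that
builds it from scratch. The rule in
is_valid_nibble() is my assumption
about which raw nibbles 6-and-2
encoding allows (high bit set, at most
one pair of consecutive zero bits, at
least one pair of adjacent one bits not
counting bit 7). The function names are
made up for illustration; none of this
is code from 0boot.

```python
def is_valid_nibble(n):
    """Assumed 6-and-2 rule for a raw disk nibble (see text above)."""
    bits = [(n >> i) & 1 for i in range(8)]
    if not bits[7]:
        return False
    zero_pairs = sum(1 for i in range(7)
                     if not bits[i] and not bits[i + 1])
    adjacent_ones = any(bits[i] and bits[i + 1] for i in range(6))
    return zero_pairs <= 1 and adjacent_ones

# Table base minus the lowest valid nibble: $036C - $96 = $02D6
BASE = 0x036C - 0x96

def build_preshift():
    """Map table address -> pre-shifted value.

    Addresses that belong to invalid raw nibbles are simply absent.
    """
    table = {}
    value = 0
    for raw in range(0x96, 0x100):
        if is_valid_nibble(raw):
            table[BASE + raw] = (value << 2) & 0xFF
            value += 1
    return table

preshift = build_preshift()
for addr in (0x36C, 0x36D, 0x370, 0x373, 0x3D4, 0x3D5):
    print(f"${addr:03X} -> ${preshift[addr]:02X}")
```

Indexing by BASE + raw is the same
"$02D6,X" trick the loader uses later:
the subtraction is folded into the
table address, so no math happens while
the disk is spinning.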


Each value in this "pre-shift" table
also serves as an index into the first
table (with all the 2-bit bytes). This
wasn't an accident; I mean, that sort
of magic doesn't just happen. The table
of 2-bit bytes is arranged so that we
can take one of the raw nibbles that
needs to be decoded and split apart
(one of the first $56 raw nibbles in
the data field), use it as an index
into the pre-shift table, then use that
pre-shifted value as an index into the
first table to get the 2-bit value we
need. That's a neat trick.

; this loop creates the pre-shift table
; at $36C
0854-   A2 6A       LDX   #$6A
0856-   1E 6B 03    ASL   $036B,X
0859-   1E 6B 03    ASL   $036B,X
085C-   CA          DEX
085D-   D0 F7       BNE   $0856

Wait, what?

It turns out the drive firmware already
creates a table that looks very similar
to the pre-shift table we want... it's
just not shifted yet! Since we're not
calling the drive firmware anymore, we
can take full advantage of this table
that's guaranteed to be in memory.

And this is the result (".." means the
address is unused):

036C-             00 04 .. ..
0370- 08 0C .. 10 14 18 .. ..
0378- .. .. .. .. 1C 20 .. ..
0380- .. 24 28 2C 30 34 .. ..
0388- 38 3C 40 44 48 4C .. 50
0390- 54 58 5C 60 64 68 .. ..
0398- .. .. .. .. .. .. .. ..
03A0- .. 6C .. 70 74 78 .. ..
03A8- .. 7C .. .. 80 84 .. 88
03B0- 8C 90 94 98 9C A0 .. ..
03B8- .. .. .. A4 A8 AC .. B0
03C0- B4 B8 BC C0 C4 C8 .. ..
03C8- CC D0 D4 D8 DC E0 .. E4
03D0- E8 EC F0 F4 F8 FC

; this loop creates the table of 2-bit
; values at $200, magically arranged to
; enable easy lookups later
085F-   C8          INY
0860-   46 BA       LSR   $BA
0862-   46 BA       LSR   $BA
0864-   B5 E7       LDA   $E7,X
0866-   99 FF 01    STA   $01FF,Y
0869-   E6 AF       INC   $AF
086B-   A5 AF       LDA   $AF
086D-   25 BA       AND   $BA
086F-   D0 05       BNE   $0876
0871-   E8          INX
0872-   8A          TXA
0873-   29 03       AND   #$03
0875-   AA          TAX
0876-   C8          INY
0877-   C8          INY
0878-   C8          INY
0879-   C8          INY
087A-   C0 04       CPY   #$04
087C-   B0 E6       BCS   $0864
087E-   C8          INY
087F-   C0 04       CPY   #$04
0881-   90 DD       BCC   $0860

And this is the result:

0200- 00 00 00 .. 00 00 02 ..
0208- 00 00 01 .. 00 00 03 ..
0210- 00 02 00 .. 00 02 02 ..
0218- 00 02 01 .. 00 02 03 ..
0220- 00 01 00 .. 00 01 02 ..
0228- 00 01 01 .. 00 01 03 ..
0230- 00 03 00 .. 00 03 02 ..
0238- 00 03 01 .. 00 03 03 ..
0240- 02 00 00 .. 02 00 02 ..
0248- 02 00 01 .. 02 00 03 ..
0250- 02 02 00 .. 02 02 02 ..
0258- 02 02 01 .. 02 02 03 ..
0260- 02 01 00 .. 02 01 02 ..
0268- 02 01 01 .. 02 01 03 ..
0270- 02 03 00 .. 02 03 02 ..
0278- 02 03 01 .. 02 03 03 ..
0280- 01 00 00 .. 01 00 02 ..
0288- 01 00 01 .. 01 00 03 ..
0290- 01 02 00 .. 01 02 02 ..
0298- 01 02 01 .. 01 02 03 ..
02A0- 01 01 00 .. 01 01 02 ..
02A8- 01 01 01 .. 01 01 03 ..
02B0- 01 03 00 .. 01 03 02 ..
02B8- 01 03 01 .. 01 03 03 ..
02C0- 03 00 00 .. 03 00 02 ..
02C8- 03 00 01 .. 03 00 03 ..
02D0- 03 02 00 .. 03 02 02 ..
02D8- 03 02 01 .. 03 02 03 ..
02E0- 03 01 00 .. 03 01 02 ..
02E8- 03 01 01 .. 03 01 03 ..
02F0- 03 03 00 .. 03 03 02 ..
02F8- 03 03 01 .. 03 03 03 ..
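
The layout is easier to see in a few
lines of Python. This is a sketch of
the table as it appears in the dump
above, not a translation of the 6502
loop. It assumes each column holds one
bit pair of the 6-bit value with its
two bits swapped, which is what the
dump shows. swap2(), build_twobit(),
and dump_row() are hypothetical names.

```python
def swap2(p):
    """Swap the two bits of a 2-bit value (%01 <-> %10)."""
    return ((p & 1) << 1) | (p >> 1)

def build_twobit():
    table = [None] * 256               # None marks the unused column
    for v in range(64):                # v = decoded 6-bit value
        row = v * 4                    # same as the pre-shifted value
        table[row + 0] = swap2((v >> 4) & 3)   # bits 4-5
        table[row + 1] = swap2((v >> 2) & 3)   # bits 2-3
        table[row + 2] = swap2(v & 3)          # bits 0-1
    return table

def dump_row(table, offset):
    """Format 8 bytes like the monitor-style dump in the text."""
    cells = [".." if b is None else f"{b:02X}"
             for b in table[offset:offset + 8]]
    return f"{0x200 + offset:04X}- " + " ".join(cells)

twobit = build_twobit()
for offset in range(0, 0x100, 8):
    print(dump_row(twobit, offset))
```

Because the row offset is the 6-bit
value times 4, a pre-shifted value can
be used directly as the index, and the
column is picked by the base address
($0200, $0201, or $0202).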

; to reproduce the experience of the
; original disk, we switch to hi-res
; page 1 and let the title page load
; progressively during boot
0883-   2C 50 C0    BIT   $C050
0886-   2C 54 C0    BIT   $C054
0889-   2C 57 C0    BIT   $C057
088C-   2C 52 C0    BIT   $C052

; And that's all she wrote. Everything
; else is already lined up on the
; stack. All that's left to do is
; "return" and let the stack guide us
; through the rest of the boot.
088F-   60          RTS

[Note to future self: $08A0..$08FF is
 available for game-specific init code,
 but it can't rely on or disturb zero
 page in any way. That rules out a lot
 of built-in ROM routines; be careful.]

                   ~

               Chapter 9
              0boot boot1


The rest of the boot runs from zero
page. It's hard to show you exactly
what boot1 will look like, because it
relies heavily on self-modifying code.

In a standard DOS 3.3 RWTS, the
softswitch to read the data latch is
"LDA $C08C,X", where X is the boot slot
times 16 (to allow disks to boot from
any slot). 0boot also supports booting
from any slot, but instead of using an
index, each fetch instruction is pre-
set based on the boot slot. Not only
does this free up the X register, it
lets us juggle all the registers and
put the raw nibble value in whichever
one is convenient at the time. (We take
full advantage of this freedom.) I've
marked each pre-set softswitch with
"o_O" to remind you that self-modifying
code is awesome.
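
As a rough illustration of what "pre-
set based on the boot slot" means, here
is a Python sketch. The softswitch
offsets are the standard Disk II ones;
the idea of starting from a "$C08C"
placeholder and overwriting its operand
is my assumption for the example, and
softswitch() and patch_fetch() are
hypothetical helpers. How 0boot itself
performs the patch is not shown here.

```python
# Offsets of a few Disk II softswitches from $C080 + slot * 16
DISK_II = {"phase0_off": 0x0, "motor_off": 0x8, "read_latch": 0xC}

def softswitch(name, slot):
    """Absolute address of a Disk II softswitch for a given slot."""
    return 0xC080 + slot * 16 + DISK_II[name]

def patch_fetch(code, operand_offset, slot):
    """Overwrite the 16-bit operand of an absolute-mode instruction."""
    addr = softswitch("read_latch", slot)
    code[operand_offset] = addr & 0xFF      # low byte first
    code[operand_offset + 1] = addr >> 8    # then high byte
    return code

# "LDA $C08C" assembled with a placeholder operand (assumption)
fetch = bytearray([0xAD, 0x8C, 0xC0])
patch_fetch(fetch, 1, slot=6)
print(fetch.hex(" "))
```

For slot 6 this gives $C0EC, which is
the address that appears in the
listings in this chapter.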

There are several other instances of
addresses and constants that get
modified while boot1 is running. I've
marked these with "/!\" to remind you
that self-modifying code is dangerous
and you should not try this at home.

The first thing popped off the stack is
the drive arm move routine at $00B7. It
moves the drive exactly one phase (half
a track).

00B7-   E6 BA       INC   $BA

; This value was set at $00B7 (above).
; It's incremented monotonically, but
; it's ANDed with $03 later, so its
; exact value isn't relevant.
00B9-   A0 3F       LDY   #$3F      /!\

; short wait for PHASEON
00BB-   A9 04       LDA   #$04
00BD-   20 C3 00    JSR   $00C3

; fall through
00C0-   88          DEY

; longer wait for PHASEOFF
00C1-   69 41       ADC   #$41
00C3-   85 CE       STA   $CE

; calculate the proper stepper motor to
; access
00C5-   98          TYA
00C6-   29 03       AND   #$03
00C8-   2A          ROL
00C9-   AA          TAX

; This address was set at $0826,
; based on the boot slot.
00CA-   BD E0 C0    LDA   $C0E0,X   /!\

; This value was set at $00C3 so that
; PHASEON and PHASEOFF have optimal
; wait times.
00CD-   A9 D1       LDA   #$D1      /!\

; wait exactly the right amount of time
; after accessing the proper stepper
; motor
00CF-   4C A8 FC    JMP   $FCA8

Since the drive arm routine only moves
one phase, it was pushed to the stack
twice before each track read. Our game
is stored on whole tracks; this half-
track trickery is only to save a few
bytes of code in boot1. (Hey, we're on
zero page; space is tight!)

The track read routine starts at $0001,
because that let us save 1 byte in the
boot0 code when we were pushing
addresses to the stack. (We could just
push $00 twice.)

; sectors-left-to-read-on-this-track
; counter (incremented to #$00)
0001-   A2 F0       LDX   #$F0
0003-   86 00       STX   $00

We initialize an array at $00DB that
tracks which sectors we've read from
the current track. (The code below says
"STA $EB,X", but X starts at $F0 and
zero page indexing wraps around, so the
first store lands at $00DB.) Astute
readers will notice that this part of
zero page had real data in it -- some
addresses that were pushed to the
stack, and some other values that were
used to create the 2-bit table at
$0200. All true, but all those
operations are now complete, and the
space is now available for unrelated
uses.

The array is in logical sector order;
we convert physical to logical sectors
immediately after reading the address
field. Values are the actual pages in
memory where that sector should go, and
they get zeroed once the sector is read
(so we don't waste time decoding the
same sector twice).

; starting address (game-specific;
; this one starts loading at $0400)
0005-   A9 04       LDA   #$04      /!\
0007-   95 EB       STA   $EB,X
0009-   E6 06       INC   $06
000B-   E8          INX
000C-   D0 F7       BNE   $0005
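
Here is the same loop as a Python
sketch, to show where the stores land.
It models only the bytes this loop
touches (real zero page is full of
code), and init_track_state() is a
made-up name.

```python
def init_track_state(first_page):
    zp = [0] * 256                 # models only the bytes we touch
    x = 0xF0                       # LDX #$F0
    zp[0x00] = x                   # STX $00 (sectors-left counter)
    page = first_page              # operand of the LDA at $0005
    while True:
        zp[(0xEB + x) & 0xFF] = page   # STA $EB,X (wraps in zero page)
        page = (page + 1) & 0xFF       # INC $06
        x = (x + 1) & 0xFF             # INX
        if x == 0:                     # BNE $0005 falls through
            return zp, page

zp, next_page = init_track_state(0x04)
print([f"${p:02X}" for p in zp[0xDB:0xEB]])
print(f"LDA operand is now ${next_page:02X}")
```

The 16 entries end up at $00DB..$00EA
holding pages $04..$13. The operand of
the LDA is left at $14, which is
presumably how the next track picks up
where this one stopped.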

000E-   20 D5 00    JSR   $00D5

; subroutine reads a nibble and
; stores it in the accumulator
00D5-   AD EC C0    LDA   $C0EC     o_O
00D8-   10 FB       BPL   $00D5
00DA-   60          RTS

Continuing from $0011...

; first nibble must be $D5
0011-   C9 D5       CMP   #$D5
0013-   D0 F9       BNE   $000E

; read second nibble, must be $AA
0015-   20 D5 00    JSR   $00D5
0018-   C9 AA       CMP   #$AA
001A-   D0 F5       BNE   $0011

; We actually need the Y register to be
; $AA for unrelated reasons later, so
; let's set that now. (We have time,
; and it saves 1 byte!)
001C-   A8          TAY

; read the third nibble
001D-   20 D5 00    JSR   $00D5

; is it $AD?
0020-   49 AD       EOR   #$AD

; Yes, which means this is the data
; prologue. Branch forward to start
; reading the data field.
0022-   F0 22       BEQ   $0046

If that third nibble is not $AD, we
assume it's the end of the address
prologue. ($96 would be the third
nibble of a standard address prologue,
but we don't actually check.) We fall
through and start decoding the 4-4
encoded values in the address field.

0024-   A0 02       LDY   #$02

The first time through this loop,
we'll read the disk volume number.
The second time, we'll read the track
number. The third time, we'll read
the physical sector number. We don't
actually care about the disk volume or
the track number, and once we get the
sector number, we don't verify the
address field checksum. YOLO.

0026-   20 D5 00    JSR   $00D5
0029-   2A          ROL
002A-   85 AF       STA   $AF
002C-   20 D5 00    JSR   $00D5
002F-   25 AF       AND   $AF
0031-   88          DEY
0032-   10 F2       BPL   $0026
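
The ROL/AND pair is the whole 4-4
decoder. A Python sketch, assuming the
usual 4-and-4 layout (odd bits in the
first nibble, even bits in the second,
gaps filled with 1s) and assuming the
carry is set going into the ROL, which
it should be after the matching CMP or
a previous ROL of a valid nibble.
encode_44() and decode_44() are names I
made up.

```python
def encode_44(value):
    """Split one byte into two disk nibbles (assumed 4-and-4 layout)."""
    return ((value >> 1) | 0xAA, value | 0xAA)

def decode_44(first, second):
    """ROL the first nibble (carry in = 1), then AND with the second."""
    rolled = ((first << 1) | 1) & 0xFF
    return rolled & second

first, second = encode_44(0x05)
print(f"${first:02X} ${second:02X} -> ${decode_44(first, second):02X}")
```

Only the last value decoded (the
sector number) survives the loop, since
each pass overwrites the accumulator.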

; take physical sector number (in A)
; and use it to look up the logical
; sector number
0034-   AA          TAX
0035-   BC 00 01    LDY   $0100,X

; store logical sector number
0038-   84 AF       STY   $AF

; use logical sector number as an
; index into the sector address array
; to get the target page (where we want
; to store this sector in memory)
003A-   B6 DB       LDX   $DB,Y

; store the target page in several
; places throughout the following code
003C-   86 9E       STX   $9E
003E-   CA          DEX
003F-   86 6E       STX   $6E
0041-   86 86       STX   $86
0043-   E8          INX

; This is an unconditional branch,
; because the ROL at $0029 will always
; set the carry. We're done processing
; the address field, so we need to loop
; back and wait for the data prologue.
0044-   B0 C8       BCS   $000E

; execution continues here (from $0022)
; after matching the data prologue
0046-   E0 00       CPX   #$00

; If X is still #$00, it means we found
; a data prologue before we found an
; address prologue. In that case, we
; have to skip this sector, because we
; don't know which sector it is and we
; wouldn't know where to put it.
0048-   F0 C4       BEQ   $000E

Nibble loop #1 reads nibbles $00..$55,
looks up the corresponding offset in
the pre-shift table at $036C, and
stores that offset in the temporary
buffer at $0300.

; initialize rolling checksum to #$00
004A-   85 58       STA   $58
004C-   AE EC C0    LDX   $C0EC     o_O
004F-   10 FB       BPL   $004C

; The nibble value is in the X register
; now. The lowest possible nibble value
; is $96 and the highest is $FF. To
; look up the offset in the table at
; $036C, we need to subtract #$96 from
; $036C and add X.
0051-   BD D6 02    LDA   $02D6,X

; Now the accumulator has the offset
; into the table of individual 2-bit
; combinations ($0200..$02FF). Store
; that offset in the temporary buffer
; at $0300, in the order we read the
; nibbles. But the Y register started
; counting at $AA, so we need to
; subtract $AA from $0300 and add Y.
0054-   99 56 02    STA   $0256,Y

; The EOR value is set at $004A
; each time through loop #1.
0057-   49 00       EOR   #$00      /!\
0059-   C8          INY
005A-   D0 EE       BNE   $004A

Here endeth nibble loop #1.

Nibble loop #2 reads nibbles $56..$AB,
combines them with bits 0-1 of the
appropriate nibble from the first $56,
and stores them in bytes $00..$55 of
the target page in memory.

005C-   A0 AA       LDY   #$AA
005E-   AE EC C0    LDX   $C0EC     o_O
0061-   10 FB       BPL   $005E
0063-   5D D6 02    EOR   $02D6,X
0066-   BE 56 02    LDX   $0256,Y
0069-   5D 02 02    EOR   $0202,X

; This address was set at $003F
; based on the target page (minus 1
; so we can add Y from $AA..$FF).
006C-   99 56 D1    STA   $D156,Y   /!\
006F-   C8          INY
0070-   D0 EC       BNE   $005E

Here endeth nibble loop #2.

Nibble loop #3 reads nibbles $AC..$101,
combines them with bits 2-3 of the
appropriate nibble from the first $56,
and stores them in bytes $56..$AB of
the target page in memory.

0072-   29 FC       AND   #$FC
0074-   A0 AA       LDY   #$AA
0076-   AE EC C0    LDX   $C0EC     o_O
0079-   10 FB       BPL   $0076
007B-   5D D6 02    EOR   $02D6,X
007E-   BE 56 02    LDX   $0256,Y
0081-   5D 01 02    EOR   $0201,X

; This address was set at $0041
; based on the target page (minus 1
; so we can add Y from $AA..$FF).
0084-   99 AC D1    STA   $D1AC,Y   /!\
0087-   C8          INY
0088-   D0 EC       BNE   $0076

Here endeth nibble loop #3.

Loop #4 reads nibbles $102..$155,
combines them with bits 4-5 of the
appropriate nibble from the first $56,
and stores them in bytes $AC..$FF of
the target page in memory.

008A-   29 FC       AND   #$FC
008C-   A2 AC       LDX   #$AC
008E-   AC EC C0    LDY   $C0EC     o_O
0091-   10 FB       BPL   $008E
0093-   59 D6 02    EOR   $02D6,Y
0096-   BC 54 02    LDY   $0254,X
0099-   59 00 02    EOR   $0200,Y

; This address was set at $003C
; based on the target page.
009C-   9D 00 D1    STA   $D100,X   /!\
009F-   E8          INX
00A0-   D0 EC       BNE   $008E

Here endeth nibble loop #4.
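
To see the four loops work together,
here is a Python sketch that encodes a
page and then decodes it with nothing
but table lookups and XORs, the way the
loops above do. Assumptions, loudly:
the encoder follows what I understand
to be the standard 6-and-2 layout ($56
packed nibbles, $100 data nibbles, one
checksum nibble, each XORed with its
predecessor); the nibble validity rule
is the same assumed rule as for the
pre-shift table; and every name here is
hypothetical. It also handles the
checksum nibble that comes right after
loop #4.

```python
def is_valid(n):
    b = [(n >> i) & 1 for i in range(8)]
    zero_pairs = sum(1 for i in range(7) if not (b[i] or b[i + 1]))
    return zero_pairs <= 1 and any(b[i] and b[i + 1] for i in range(6))

def swap2(p):
    return ((p & 1) << 1) | (p >> 1)

# 6-bit value -> disk nibble, and disk nibble -> value * 4
NIBBLES = [n for n in range(0x96, 0x100) if is_valid(n)]
PRESHIFT = {raw: v << 2 for v, raw in enumerate(NIBBLES)}

# 2-bit table, indexed by pre-shifted value plus column number
TWOBIT = [0] * 256
for v in range(64):
    TWOBIT[4 * v + 0] = swap2((v >> 4) & 3)
    TWOBIT[4 * v + 1] = swap2((v >> 2) & 3)
    TWOBIT[4 * v + 2] = swap2(v & 3)

def encode_sector(page):
    """Assumed standard 6-and-2 data field (no prologue/epilogue)."""
    values = []
    for i in range(0x56):
        hi = page[i + 0xAC] & 3 if i + 0xAC < 0x100 else 0
        values.append(swap2(hi) << 4
                      | swap2(page[i + 0x56] & 3) << 2
                      | swap2(page[i] & 3))
    values += [b >> 2 for b in page]
    out, prev = [], 0
    for v in values:
        out.append(NIBBLES[v ^ prev])   # each nibble XORed with previous
        prev = v
    return out + [NIBBLES[prev]]        # checksum nibble

def decode_sector(nibbles):
    """Mirror of nibble loops #1-#4; returns None if the checksum fails."""
    stream = iter(nibbles)
    buf, page, a = [], [0] * 0x100, 0
    for _ in range(0x56):                        # loop #1
        p = PRESHIFT[next(stream)]
        buf.append(p)                            # STA $0256,Y
        a ^= p                                   # rolling EOR
    for base, column, count in ((0x00, 2, 0x56),     # loop #2
                                (0x56, 1, 0x56),     # loop #3
                                (0xAC, 0, 0x54)):    # loop #4
        for i in range(count):
            a ^= PRESHIFT[next(stream)]          # EOR $02D6,X
            a ^= TWOBIT[buf[i] + column]         # EOR $020n,X
            page[base + i] = a                   # STA target
        a &= 0xFC                                # AND #$FC
    a ^= PRESHIFT[next(stream)]                  # checksum nibble
    return bytes(page) if a == 0 else None

page = bytes(range(256))
print(decode_sector(encode_sector(page)) == page)
```

Note that the buffer holds the raw
(not yet accumulated) pre-shifted
values. The low two bits of the
accumulator do the accumulating inside
each loop, which is why they have to be
masked off with AND #$FC between loops.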

; Finally, get the last nibble,
; which is the checksum of all
; the previous nibbles.
00A2-   29 FC       AND   #$FC
00A4-   AC EC C0    LDY   $C0EC     o_O
00A7-   10 FB       BPL   $00A4
00A9-   59 D6 02    EOR   $02D6,Y

; If checksum fails, start over.
; Note: we really want to branch
; to $000E, but that's too far,
; so we're branching to an earlier
; unrelated "BCS" which branches
; to $000E. The carry is always
; set at this point (it was set
; by the "CPX #$00" all the way
; back at $0046), so the BCS is
; an unconditional jump and we
; end up where we want (at $000E).
00AC-   D0 96       BNE   $0044
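
The "too far" claim is easy to check.
A 6502 relative branch takes a signed
8-bit offset counted from the address
after the 2-byte branch instruction.
This small Python sketch (helper names
are mine) works out the targets from
the operand bytes in the listing.

```python
def branch_target(addr, operand):
    """Target of a 2-byte relative branch located at addr."""
    offset = operand - 0x100 if operand >= 0x80 else operand
    return (addr + 2 + offset) & 0xFFFF

def reachable(addr, target):
    """True if target fits in a signed 8-bit offset from addr + 2."""
    return -128 <= target - (addr + 2) <= 127

print(f"${branch_target(0x00AC, 0x96):04X}")   # BNE at $00AC
print(f"${branch_target(0x0044, 0xC8):04X}")   # BCS at $0044
print(reachable(0x00AC, 0x000E))
```

From $00AC, the real destination $000E
is 160 bytes back, which does not fit,
while $0044 is 106 bytes back, which
does.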

; This was set to the logical
; sector number (at $0038), so
; this is an index into the 16-
; byte array at $00DB.
00AE-   A0 00       LDY   #$00      /!\

; store #$00 at this index in the
; sector array to indicate that
; we've read this sector
00B0-   96 DB       STX   $DB,Y

; are we done yet?
00B2-   E6 00       INC   $00

; nope, loop back to read more sectors
00B4-   D0 8E       BNE   $0044

; And that's all she read.
00B6-   60          RTS

0boot's track read routine is done when
$0000 hits $00, which is astonishingly
beautiful. Like, "now I know God" level
of beauty.

And so it goes: we pop another address
off the stack, move the drive arm, read
another track, and so on. Eventually we
finish moving and reading, moving and
reading, and we get to the home stretch
and start calling ROM routines.

  $FE88 (IN#0, pushed at $0840)
  $FE92 (PR#0, pushed at $0840)

Next on the stack:

  $00D1 (turn off drive motor)

00D2-   AD E8 C0    LDA   $C0E8     /!\

Note that this routine falls through to
the one at $00D5 which reads a nibble
from disk, but that's harmless.

Next on the stack is the game's entry
point:

  $05FF

...which jumps to $0600 and starts the
game.

The entire boot process takes about
three seconds.

Quod erat liberandum.

                   ~

           Acknowledgements


Thanks to qkumba for writing 0boot, for
explaining 6-and-2 encoding to me, and
for being that rare combination of
smart and kind.

                   ~

               Changelog

2020-06-24

- another typo (incorrect address in
  comment)

2020-06-23

- typo in the 6-and-2 encoding diagram
  [thanks to Andrew R. for pointing out
  this typo, which has been present in
  every such diagram for the past five
  years, which I now get to correct]

2020-06-23

- initial release

---------------------------------------
A 4am crack                    No. 2204
------------------EOF------------------
